Beauchamp et al., 2020. Data from: Stopover refueling, movement, and departure decisions in the White-throated Sparrow: the influence of intrinsic and extrinsic factors during spring migration, Dryad, Dataset

White-throated Sparrows are a bird species with two color morphs: a tan-striped morph (top) and a white-striped morph (bottom). The morphs result from a large chromosomal inversion, which alters not only feather coloration but also some behavioral traits, such as aggressiveness.

Figure: tan-striped color morph (top) and white-striped color morph (bottom).

White-striped sparrows have been shown to be more aggressive and territorial than tan-striped individuals, including when defending food sources such as feeders (Kopachena and Falls, 1993). This may imply that white-striped individuals carry more fat than their more timid tan-striped counterparts.

To test this, I fit a binomial generalized linear model (GLM) with color morph as the binary response and the amount of fat each bird carries as the predictor.

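library(ggplot2)  # plotting
library(arm)      # display() and binnedplot()
WTSPdata <- read.csv("data/WTSP_data.csv")
WTSPdata <- na.omit(WTSPdata)  # drop rows with any missing values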

First, here is a quick look at the raw data with the default loess smoother:

ggplot(WTSPdata, aes(fat, morph)) +
  geom_point(color="forestgreen", size=3, alpha=.5) +
  geom_smooth(color="darkgreen") +
  xlab ("Fat (grams)") +
  ylab ("Color Morph") +
  labs(title="White-throated Sparrow Morph and weight", subtitle="Raw Fit", caption="1 = White-striped morph
       0 = Tan-striped morph")+
  theme_bw()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

ggplot(WTSPdata, aes(fat, morph)) +
  geom_point(color="forestgreen", size=3, alpha=.5) +
  geom_smooth(method = "glm", method.args = list(family = binomial(link = "logit")), color = "darkgreen") +
  xlab ("Fat (grams)") +
  ylab ("Color Morph") +
  labs(title="White-throated Sparrow Morph and weight", subtitle="Binary GLM", caption="1 = White-striped morph
       0 = Tan-striped morph")+
  theme_bw()
## `geom_smooth()` using formula 'y ~ x'

fit.1 <- glm(morph ~ fat, data = WTSPdata, family = binomial(link = "logit"))
display(fit.1)
## glm(formula = morph ~ fat, family = binomial(link = "logit"), 
##     data = WTSPdata)
##             coef.est coef.se
## (Intercept)  0.14     0.43  
## fat         -0.02     0.10  
## ---
##   n = 142, k = 2
##   residual deviance = 196.7, null deviance = 196.7 (difference = 0.0)

Using a binned residual plot to verify that the data works in a binomial GLM:

x <- predict(fit.1)
y <- resid(fit.1)
binnedplot(x, y)

The points all fall within ±2 standard errors.

coef(fit.1)
## (Intercept)         fat 
##  0.14085413 -0.02250686
confint(fit.1)
## Waiting for profiling to be done...
##                  2.5 %    97.5 %
## (Intercept) -0.6990445 0.9862898
## fat         -0.2296390 0.1837120
invlogit <- function(x) {1 / ( 1+exp(-x) ) } 
invlogit(coef(fit.1))
## (Intercept)         fat 
##   0.5351554   0.4943735

The logit link function and binomial distribution account for the constraints on the mean and variance of binary data: fitted probabilities stay between 0 and 1, and the variance depends on the mean.

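The logistic curve is linear on the logit scale, where the regression intercept is 0.141 and the slope is -0.023. Back-transforming the intercept gives 0.535, the predicted probability of the white-striped morph at a fat value of 0. The back-transformed slope (0.494) has no direct interpretation as a probability.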
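
The coefficients printed by coef(fit.1) are on the logit scale, so it can help to convert them explicitly. The sketch below is a minimal illustration that reuses those printed values with base R's plogis() (the inverse logit) and exp(). The names b0, b1, and fat_values are introduced here for illustration only, and the fat values are arbitrary assumed inputs rather than values taken from the data.

```r
# Coefficients copied from the coef(fit.1) output above (logit scale)
b0 <- 0.14085413   # intercept
b1 <- -0.02250686  # slope for fat

# Inverse logit of the intercept: predicted probability of the
# white-striped morph (coded 1) when fat = 0
p_at_zero <- plogis(b0)

# The slope is easier to read as an odds ratio per one-unit increase in fat
odds_ratio <- exp(b1)

# Predicted probabilities over a few illustrative (assumed) fat values
fat_values <- c(0, 2, 4, 6)
p_hat <- plogis(b0 + b1 * fat_values)

print(round(p_at_zero, 3))
print(round(odds_ratio, 3))
print(round(p_hat, 3))
```

An odds ratio this close to 1 matches the nearly flat fitted curve in the plot: the predicted probability barely changes across the fat values.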

summary(fit.1)
## 
## Call:
## glm(formula = morph ~ fat, family = binomial(link = "logit"), 
##     data = WTSPdata)
## 
## Deviance Residuals: 
##    Min      1Q  Median      3Q     Max  
## -1.234  -1.203   1.127   1.151   1.194  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.14085    0.42766   0.329    0.742
## fat         -0.02251    0.10474  -0.215    0.830
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 196.74  on 141  degrees of freedom
## Residual deviance: 196.69  on 140  degrees of freedom
## AIC: 200.69
## 
## Number of Fisher Scoring iterations: 3

Checking the assumption that the ratio of the residual deviance to the residual degrees of freedom (the dispersion parameter) is approximately 1. Based on the summary() output it is somewhat higher: 196.7/140 = 1.4. For ungrouped binary data this ratio is only a rough guide.
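
The ratio can also be computed directly from a fitted model rather than read off the summary table. The following is a small sketch under stated assumptions: deviance_ratio() is a hypothetical helper written for this example, and the data are simulated stand-ins, not the sparrow data set. deviance() and df.residual() are standard extractor functions in base R.

```r
# Hypothetical helper (not in the original analysis):
# residual deviance divided by residual degrees of freedom
deviance_ratio <- function(fit) {
  deviance(fit) / df.residual(fit)
}

# Simulated stand-in data, NOT the sparrow data set
set.seed(1)
sim <- data.frame(x = rnorm(200))
sim$y <- rbinom(200, size = 1, prob = plogis(0.2 + 0.5 * sim$x))

sim_fit <- glm(y ~ x, data = sim, family = binomial(link = "logit"))
ratio <- deviance_ratio(sim_fit)
print(ratio)

# The same ratio from the numbers reported in summary(fit.1)
reported_ratio <- 196.69 / 140
print(round(reported_ratio, 2))
```

With the real data loaded, the same helper could be called as deviance_ratio(fit.1).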

The p-value for fat is 0.83, which indicates no significant relationship between a bird’s color morph and the amount of fat it carries.

In this case, the binomial GLM of the White-throated Sparrow data gives no evidence of a relationship between color morph and the amount of fat a bird carries.

However, this pairing of variables isn’t the best illustration of a binomial GLM, because color morph shows no obvious relationship to fat. What if we used a more obvious comparison, such as sex and wing length? Many songbird species are sexually dimorphic, with males often physically larger than females. Let’s try those variables in a binomial GLM:

ggplot(WTSPdata, aes(wing_length, sex)) +
  geom_point(color="forestgreen", size=3, alpha=.5) +
  geom_smooth(method = "glm", method.args = list(family = binomial(link = "logit")), color = "darkgreen") +
  xlab ("Wing Length (mm)") +
  ylab ("Sex") +
  labs(title="White-throated Sparrow Sex to Wing Length", subtitle="Binary GLM", caption="1 = Male
       0 = Female")+
  theme_bw()
## `geom_smooth()` using formula 'y ~ x'

fit.2 <- glm(sex ~ wing_length, data = WTSPdata, family = binomial(link = "logit"))
display(fit.2)
## glm(formula = sex ~ wing_length, family = binomial(link = "logit"), 
##     data = WTSPdata)
##             coef.est coef.se
## (Intercept) -90.72    16.01 
## wing_length   1.27     0.22 
## ---
##   n = 142, k = 2
##   residual deviance = 58.7, null deviance = 192.1 (difference = 133.4)

Using a binned residual plot to verify that the data works in a binomial GLM:

x <- predict(fit.2)
y <- resid(fit.2)
binnedplot(x, y)

Most points fall within ±2 standard errors.

invlogit <- function(x) {1 / ( 1+exp(-x) ) } 
invlogit(coef(fit.2))
##  (Intercept)  wing_length 
## 3.971169e-40 7.814420e-01
summary(fit.2)
## 
## Call:
## glm(formula = sex ~ wing_length, family = binomial(link = "logit"), 
##     data = WTSPdata)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.67820  -0.18255  -0.05125   0.23702   2.39603  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -90.7243    16.0126  -5.666 1.46e-08 ***
## wing_length   1.2741     0.2248   5.667 1.45e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 192.066  on 141  degrees of freedom
## Residual deviance:  58.651  on 140  degrees of freedom
## AIC: 62.651
## 
## Number of Fisher Scoring iterations: 7

Checking the assumption that the ratio of the residual deviance to the residual degrees of freedom (the dispersion parameter) is approximately 1. Here it is 58.7/140 = 0.42, which is below 1.

The p-value for wing length is < 0.05, which indicates a significant relationship between the two variables. As wing length increases, the probability of a bird being male also increases.
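
One way to make this relationship concrete is to ask at what wing length the fitted curve crosses a probability of 0.5, and how quickly the odds change per millimeter. The sketch below uses only the coefficients printed by summary(fit.2). The names wing_50, odds_ratio_mm, and wing_mm are introduced for illustration, and the wing lengths are assumed example values.

```r
# Coefficients copied from the summary(fit.2) output above (logit scale)
b0 <- -90.7243
b1 <- 1.2741

# Wing length at which the fitted probability of being male is 0.5
wing_50 <- -b0 / b1

# Multiplicative change in the odds of being male per extra mm of wing
odds_ratio_mm <- exp(b1)

# Fitted probabilities at a few illustrative (assumed) wing lengths
wing_mm <- c(68, 70, 72, 74)
p_male <- plogis(b0 + b1 * wing_mm)

print(round(wing_50, 1))
print(round(odds_ratio_mm, 2))
print(round(p_male, 3))
```

Under this fit the crossover falls at roughly 71 mm, and each additional millimeter multiplies the odds of being male by about 3.6.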

Grasshopper Sparrow Point-Count Data
Herse, Mark R.; With, Kimberly A.; Boyle, W. Alice (2019), Data from: The importance of core habitat for a threatened species in changing landscapes, Dryad, Dataset


Grasshopper Sparrows are a grassland-dependent species, requiring large continuous tracts of pristine prairie to thrive. One might assume that the larger these grasslands are, the more grasshopper sparrows would be able to inhabit that area.

In this data set, point-count surveys were conducted to quantify how many Grasshopper Sparrows were present on grassland plots of various sizes.

GRSPdata <- read.csv("data/GRSP_data.csv")

A Poisson GLM can be used to model the relationship between the number of birds detected and the size of the grassland they’re found on. It suits count data, which are non-negative integers and often include many zeros (as is the case with many point-count datasets, including this one).

ggplot(GRSPdata, aes(ta_800, grsp)) +
  geom_point(color="goldenrod", size=2, alpha=.5) +
  stat_smooth(method = glm, method.args = list(family = poisson(link = "log")), color="chocolate4", size=1.5) +
  labs(title="Grasshopper Sparrow Detections by Grassland Size", subtitle="Poisson GLM") +
  xlab ("Size of grassland (ha)") +
  ylab ("No. of Grasshopper Sparrows Detected")+
  theme_bw()

grsp.glm <- glm(grsp ~ ta_800, data = GRSPdata, family = poisson(link = log))

library(ggfortify)  # supplies autoplot() methods for fitted model objects
autoplot(grsp.glm)

The ‘normal Q–Q plot’ for the transformed residuals checks whether the Poisson distribution is appropriate for the distribution of the residuals, and the scale–location plot checks whether the mean–variance relationship is good (patternless).

anova(grsp.glm)
## Analysis of Deviance Table
## 
## Model: poisson, link: log
## 
## Response: grsp
## 
## Terms added sequentially (first to last)
## 
## 
##        Df Deviance Resid. Df Resid. Dev
## NULL                    7229    10045.6
## ta_800  1   2534.4      7228     7511.2

Because the model is fit with maximum likelihood, we get a deviance table. The deviance explained by grassland size is 2534.4, the unaccounted (residual) deviance is 7511.2 on 7228 degrees of freedom, and the total deviance (explained + unaccounted) is 10045.6.
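
The arithmetic behind the deviance table can be checked by hand. The sketch below copies the two deviances from the anova() output and uses base R's pchisq() for the likelihood-ratio test of the single added term. The variable names are illustrative, and the proportion of deviance explained is an extra descriptive summary that the original analysis does not report.

```r
# Values copied from the anova(grsp.glm) table above
null_dev  <- 10045.6  # deviance of the intercept-only model
resid_dev <- 7511.2   # deviance left after adding ta_800
explained <- null_dev - resid_dev

# Likelihood-ratio (chi-squared) test for the one added term
p_value <- pchisq(explained, df = 1, lower.tail = FALSE)

# Share of the null deviance explained by grassland size
prop_explained <- explained / null_dev

print(explained)
print(p_value)
print(round(prop_explained, 3))
```

With the fitted model available, anova(grsp.glm, test = "Chisq") would add this test to the table directly.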

Checking for over-dispersion: 7511.2/7228.0 = 1.04
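
The deviance ratio is one common check. A frequently used alternative is the Pearson-based dispersion estimate, sketched below on simulated counts (assumed stand-in data, not the Grasshopper Sparrow data set). The column names area and count and the simulation parameters are hypothetical. The quasi-Poisson refit is included only to show that its summary reports the same quantity.

```r
# Simulated stand-in counts, NOT the Grasshopper Sparrow data
set.seed(42)
sim <- data.frame(area = runif(500, min = 0, max = 200))
sim$count <- rpois(500, lambda = exp(-2 + 0.015 * sim$area))

sim_fit <- glm(count ~ area, data = sim, family = poisson(link = "log"))

# Pearson-based dispersion: sum of squared Pearson residuals / residual df
pearson_disp <- sum(residuals(sim_fit, type = "pearson")^2) / df.residual(sim_fit)
print(pearson_disp)

# A quasi-Poisson refit reports the same estimate in its summary
quasi_fit <- update(sim_fit, family = quasipoisson(link = "log"))
quasi_disp <- summary(quasi_fit)$dispersion
print(quasi_disp)
```

Values near 1 are consistent with the Poisson mean-variance assumption, while values well above 1 would suggest overdispersion.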

ggplot(GRSPdata, aes(x=grsp)) + geom_histogram(binwidth=1, fill="peru") +
  ggtitle("Distribution of Grasshopper Sparrow Counts") +
  xlab ("No. of Grasshopper Sparrows") +
  ylab ("Frequency") +
  theme_bw()

summary(grsp.glm)
## 
## Call:
## glm(formula = grsp ~ ta_800, family = poisson(link = log), data = GRSPdata)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.6742  -0.8966  -0.5042  -0.2087   5.6193  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -3.9353640  0.0869924  -45.24   <2e-16 ***
## ta_800       0.0212380  0.0005096   41.68   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 10045.6  on 7229  degrees of freedom
## Residual deviance:  7511.2  on 7228  degrees of freedom
## AIC: 12072
## 
## Number of Fisher Scoring iterations: 6

The p-value is < 0.05, which indicates a statistically significant effect of grassland size on the number of Grasshopper Sparrows detected: larger grasslands yield more detections.

Using the rounded coefficients, the expected number of Grasshopper Sparrows in a 200 ha grassland is e^(-3.935 + 0.021 × 200), or about 1.3 birds per 200 ha plot.

exp(-3.935+0.021*200)
## [1] 1.303431
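
The same back-transformation extends to other grassland sizes. The sketch below wraps it in a small function and uses the unrounded coefficients from summary(grsp.glm), so its value at 200 ha (about 1.37) is slightly higher than the hand calculation with rounded coefficients. expected_grsp() is a hypothetical helper, and the areas are assumed example values that may not span the range of the real data.

```r
# Coefficients copied from summary(grsp.glm) above (log scale)
b0 <- -3.9353640
b1 <- 0.0212380

# Hypothetical helper: expected count for a grassland area in hectares
expected_grsp <- function(area_ha) {
  exp(b0 + b1 * area_ha)
}

# Illustrative (assumed) grassland areas
areas <- c(50, 100, 150, 200)
print(round(expected_grsp(areas), 2))

# Multiplicative change in the expected count per additional hectare
rate_ratio <- exp(b1)
print(round(rate_ratio, 4))
```

Because the link is a log, each additional hectare multiplies the expected count by the same factor, roughly a 2% increase.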